MERS exploration server

Warning: this site is under development!
Warning: this site was generated by automated means from raw corpora.
The information has therefore not been validated.

Barcode identification for single cell genomics

Internal identifier: 000683 (Main/Exploration); previous: 000682; next: 000684

Authors: Akshay Tambe [United States]; Lior Pachter [United States]

Source: BMC Bioinformatics (2019)

RBID : PMC:6337828

Abstract

Background

Single-cell sequencing experiments use short DNA barcode ‘tags’ to identify reads that originate from the same cell. In order to recover single-cell information from such experiments, reads must be grouped based on their barcode tag, a crucial processing step that precedes other computations. However, this step can be difficult due to high rates of mismatch and deletion errors that can afflict barcodes.
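The grouping step described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each read starts with a fixed-length 12-base barcode followed by the payload; the function name `group_by_barcode` and the toy reads are hypothetical and are not taken from the article. The sketch groups reads by exact barcode match and shows how a single barcode error splits one cell into two groups.

```python
from collections import defaultdict

def group_by_barcode(reads, barcode_len):
    """Group read payloads by the exact barcode in the first barcode_len bases."""
    groups = defaultdict(list)
    for read in reads:
        barcode, payload = read[:barcode_len], read[barcode_len:]
        groups[barcode].append(payload)
    return dict(groups)

# Toy reads (assumed layout: 12-base barcode, then 6 bases of payload).
reads = [
    "ACGTTGCAAGCT" + "GGATCC",
    "ACGTTGCAAGCT" + "TTAGGC",
    "ACGTTCCAAGCT" + "CCGATA",  # same cell, but one mismatch in the barcode
]

groups = group_by_barcode(reads, barcode_len=12)
for barcode, payloads in groups.items():
    print(barcode, payloads)
```

With exact matching, the third read lands in its own group even though it differs from the other barcode by a single base, which is why an error-correction step is needed before downstream analysis.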

Results

Here we present an approach to identify and error-correct barcodes by traversing the de Bruijn graph of circularized barcode k-mers. Our approach is based on the observation that circularizing a barcode sequence can yield error-free k-mers even when the size of k is large relative to the length of the barcode sequence, a regime which is typical of single-cell barcoding applications. This allows for assignment of reads to consensus fingerprints constructed from k-mers.
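The observation about circularization can be checked on a toy case. The following Python sketch is an illustration only, not the Sircel implementation: the barcode, the error position, the value k = 8, and the helper names are all assumptions chosen for the example. It counts how many k-mers of a true 12-base barcode survive unchanged in an observed copy carrying one mismatch, first with ordinary (linear) k-mers and then with k-mers that wrap around the end of the barcode.

```python
def linear_kmers(seq, k):
    """All k-mers read left to right without wrapping."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def circular_kmers(seq, k):
    """All k-mers of the circularized sequence: one per start position."""
    doubled = seq + seq[:k - 1]  # append the prefix so windows can wrap around
    return [doubled[i:i + k] for i in range(len(seq))]

def shared_kmers(true_bc, observed_bc, k, kmer_fn):
    """Number of distinct k-mers common to the true and observed barcodes."""
    return len(set(kmer_fn(true_bc, k)) & set(kmer_fn(observed_bc, k)))

TRUE_BC = "ACGTTGCAAGCT"
OBSERVED = "ACGTTCCAAGCT"  # single mismatch at position 5 (G -> C)

print(shared_kmers(TRUE_BC, OBSERVED, 8, linear_kmers))    # 0
print(shared_kmers(TRUE_BC, OBSERVED, 8, circular_kmers))  # 4
```

Here every one of the five linear 8-mers overlaps the error, so none is shared, whereas the circularized barcode has twelve 8-mers and the four that start after the error and wrap around the end remain error-free.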

Conclusion

We show that for single-cell RNA-Seq, circularization improves the recovery of accurate single-cell transcriptome estimates, especially when the number of errors per read is high. This approach is robust to the type of error (mismatch, insertion, deletion), as well as to the relative abundances of the cells. Sircel, a software package that implements this approach, is described and publicly available.

Electronic supplementary material

The online version of this article (10.1186/s12859-019-2612-0) contains supplementary material, which is available to authorized users.


Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6337828
DOI: 10.1186/s12859-019-2612-0
PubMed: 30654736
PubMed Central: 6337828


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Barcode identification for single cell genomics</title>
<author>
<name sortKey="Tambe, Akshay" sort="Tambe, Akshay" uniqKey="Tambe A" first="Akshay" last="Tambe">Akshay Tambe</name>
<affiliation wicri:level="2">
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000000107068890</institution-id>
<institution-id institution-id-type="GRID">grid.20861.3d</institution-id>
<institution>Division of Biology and Biological Engineering,</institution>
<institution>California Institute of Technology,</institution>
</institution-wrap>
116 Kerckhoff Laboratory, Pasadena, CA 91125 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Californie</region>
</placeName>
<wicri:cityArea>116 Kerckhoff Laboratory, Pasadena</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Pachter, Lior" sort="Pachter, Lior" uniqKey="Pachter L" first="Lior" last="Pachter">Lior Pachter</name>
<affiliation wicri:level="2">
<nlm:aff id="Aff2">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000000107068890</institution-id>
<institution-id institution-id-type="GRID">grid.20861.3d</institution-id>
<institution>Departments of Biology and Computing &amp; Mathematical Sciences,</institution>
<institution>California Institute of Technology,</institution>
</institution-wrap>
116 Kerckhoff Laboratory, Pasadena, CA 91125 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Californie</region>
</placeName>
<wicri:cityArea>116 Kerckhoff Laboratory, Pasadena</wicri:cityArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">30654736</idno>
<idno type="pmc">6337828</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6337828</idno>
<idno type="RBID">PMC:6337828</idno>
<idno type="doi">10.1186/s12859-019-2612-0</idno>
<date when="2019">2019</date>
<idno type="wicri:Area/Pmc/Corpus">000278</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000278</idno>
<idno type="wicri:Area/Pmc/Curation">000278</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000278</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000390</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000390</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:30654736</idno>
<idno type="wicri:Area/PubMed/Corpus">000667</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000667</idno>
<idno type="wicri:Area/PubMed/Curation">000667</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000667</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000657</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">000657</idno>
<idno type="wicri:Area/Ncbi/Merge">002084</idno>
<idno type="wicri:Area/Ncbi/Curation">002084</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">002084</idno>
<idno type="wicri:Area/Main/Merge">000686</idno>
<idno type="wicri:Area/Main/Curation">000683</idno>
<idno type="wicri:Area/Main/Exploration">000683</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Barcode identification for single cell genomics</title>
<author>
<name sortKey="Tambe, Akshay" sort="Tambe, Akshay" uniqKey="Tambe A" first="Akshay" last="Tambe">Akshay Tambe</name>
<affiliation wicri:level="2">
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000000107068890</institution-id>
<institution-id institution-id-type="GRID">grid.20861.3d</institution-id>
<institution>Division of Biology and Biological Engineering,</institution>
<institution>California Institute of Technology,</institution>
</institution-wrap>
116 Kerckhoff Laboratory, Pasadena, CA 91125 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Californie</region>
</placeName>
<wicri:cityArea>116 Kerckhoff Laboratory, Pasadena</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Pachter, Lior" sort="Pachter, Lior" uniqKey="Pachter L" first="Lior" last="Pachter">Lior Pachter</name>
<affiliation wicri:level="2">
<nlm:aff id="Aff2">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000000107068890</institution-id>
<institution-id institution-id-type="GRID">grid.20861.3d</institution-id>
<institution>Departments of Biology and Computing &amp; Mathematical Sciences,</institution>
<institution>California Institute of Technology,</institution>
</institution-wrap>
116 Kerckhoff Laboratory, Pasadena, CA 91125 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Californie</region>
</placeName>
<wicri:cityArea>116 Kerckhoff Laboratory, Pasadena</wicri:cityArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2019">2019</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>DNA (genetics)</term>
<term>Genomics (methods)</term>
<term>High-Throughput Nucleotide Sequencing (methods)</term>
<term>Humans</term>
<term>Sequence Analysis, DNA (methods)</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>ADN (génétique)</term>
<term>Analyse de séquence d'ADN</term>
<term>Génomique</term>
<term>Humains</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="genetics" xml:lang="en">
<term>DNA</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr">
<term>ADN</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Genomics</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Humans</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Analyse de séquence d'ADN</term>
<term>Génomique</term>
<term>Humains</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p id="Par1">Single-cell sequencing experiments use short DNA barcode ‘tags’ to identify reads that originate from the same cell. In order to recover single-cell information from such experiments, reads must be grouped based on their barcode tag, a crucial processing step that precedes other computations. However, this step can be difficult due to high rates of mismatch and deletion errors that can afflict barcodes.</p>
</sec>
<sec>
<title>Results</title>
<p id="Par2">Here we present an approach to identify and error-correct barcodes by traversing the de Bruijn graph of circularized barcode k-mers. Our approach is based on the observation that circularizing a barcode sequence can yield error-free k-mers even when the size of <italic>k</italic> is large relative to the length of the barcode sequence, a regime which is typical of single-cell barcoding applications. This allows for assignment of reads to consensus fingerprints constructed from k-mers.</p>
</sec>
<sec>
<title>Conclusion</title>
<p id="Par3">We show that for single-cell RNA-Seq, circularization improves the recovery of accurate single-cell transcriptome estimates, especially when the number of errors per read is high. This approach is robust to the type of error (mismatch, insertion, deletion), as well as to the relative abundances of the cells. Sircel, a software package that implements this approach, is described and publicly available.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (10.1186/s12859-019-2612-0) contains supplementary material, which is available to authorized users.</p>
</sec>
</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Californie</li>
</region>
</list>
<tree>
<country name="États-Unis">
<region name="Californie">
<name sortKey="Tambe, Akshay" sort="Tambe, Akshay" uniqKey="Tambe A" first="Akshay" last="Tambe">Akshay Tambe</name>
</region>
<name sortKey="Pachter, Lior" sort="Pachter, Lior" uniqKey="Pachter L" first="Lior" last="Pachter">Lior Pachter</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000683 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000683 | SxmlIndent | more
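Outside a Dilib installation, the identifiers in a record of this shape can also be read with the Python standard library. The sketch below is a hedged illustration, not part of Dilib: it works on a simplified fragment of the record, because the `wicri:` and `nlm:` prefixes are not declared in the record as displayed and a namespace-aware parser would reject the full text unless those prefixes are bound first. The function name `extract_ids` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified fragment of the record (elements with undeclared prefixes omitted).
FRAGMENT = """
<record>
  <TEI>
    <teiHeader>
      <fileDesc>
        <titleStmt>
          <title xml:lang="en">Barcode identification for single cell genomics</title>
        </titleStmt>
        <publicationStmt>
          <idno type="pmid">30654736</idno>
          <idno type="pmc">6337828</idno>
          <idno type="doi">10.1186/s12859-019-2612-0</idno>
        </publicationStmt>
      </fileDesc>
    </teiHeader>
  </TEI>
</record>
"""

def extract_ids(xml_text):
    """Return the record title and a {type: value} mapping of its idno elements."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//titleStmt/title")
    ids = {el.get("type"): el.text for el in root.iter("idno")}
    return title, ids

title, ids = extract_ids(FRAGMENT)
print(title)
print(ids)
```

The same function could be pointed at the output of the commands above once the record's namespace prefixes have been declared or stripped.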

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:6337828
   |texte=   Barcode identification for single cell genomics
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:30654736" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021